Data Load

Data Encoding

EDA

Feature Importance

Feature Selection

Pre-clustering tSNE and PCA

Interpretation

In the PCA plot, the data points form a single dense cluster, which suggests that the customers are relatively homogeneous. The spread of points along the axes follows the directions of maximum variance, which highlight the features that differentiate customers most, but the projection does not reveal distinct groups. The t-SNE plot is similar: the points are densely packed with no clear clusters, although t-SNE does preserve local relationships between data points. So while there is no strong global structure, there might be subtle local clusters within the dataset.
Both plots show a lack of distinct clusters, suggesting a relatively homogeneous customer base. This homogeneity could be attributed to factors such as similar demographics or shared shopping habits.
In short, PCA, which focuses on global structure, shows one dense cluster of data points, while t-SNE, which emphasizes local relationships, also shows no clear separation.
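
The two projections discussed above can be sketched roughly as follows. This is only an illustration: the data is a synthetic stand-in for the customer table (three numeric features playing the role of Age, Annual Income, and Spending Score), and the scaling step, the perplexity value, and all variable names are assumptions rather than the settings used in this notebook.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)

# Hypothetical stand-in for the customer data: 200 rows, 3 numeric features.
X = np.column_stack([
    rng.integers(18, 70, size=200),   # plays the role of Age
    rng.normal(60, 25, size=200),     # plays the role of Annual Income
    rng.integers(1, 100, size=200),   # plays the role of Spending Score
]).astype(float)

# Standardize so that no single feature dominates the variance (assumed step).
X_scaled = StandardScaler().fit_transform(X)

# PCA: linear projection onto the two directions of maximum variance.
pca = PCA(n_components=2)
X_pca = pca.fit_transform(X_scaled)

# t-SNE: non-linear embedding that tries to preserve local neighborhoods.
tsne = TSNE(n_components=2, perplexity=30, random_state=0)
X_tsne = tsne.fit_transform(X_scaled)

# Share of the total variance kept by each of the two principal components.
print(pca.explained_variance_ratio_)
```

Both `X_pca` and `X_tsne` are arrays with one 2-D point per customer, so each can be passed directly to a scatter plot.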

Hierarchical clustering (Agglomerative)

Dendrogram Interpretation

- The dendrogram shows how data points are merged into groups based on their similarity.
- We can identify a suitable number of clusters (k) by looking for a large vertical gap between successive merges. Here, cutting the tree at a distance of around 60 seems to yield six well-defined clusters. The tallest gaps in the dendrogram appear at this level, which suggests good separation between the clusters.
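
The gap-based reading of the dendrogram can be approximated in code. The sketch below assumes Ward linkage and synthetic blob data, neither of which is confirmed by this notebook, and it picks the cut automatically at the largest gap between successive merge heights. That is only a heuristic: on real data it may disagree with a visual reading such as the cut at a distance of about 60 described above.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(0)

# Hypothetical data: six synthetic blobs in three dimensions.
centers = rng.uniform(-10, 10, size=(6, 3))
X = np.vstack([c + rng.normal(scale=0.5, size=(30, 3)) for c in centers])

# Agglomerative merge tree; Ward linkage is an assumption here.
Z = linkage(X, method="ward")

# Column 2 of the linkage matrix holds the merge heights (dendrogram y-axis).
heights = Z[:, 2]

# Largest vertical gap between two successive merges.
gaps = np.diff(heights)
i = int(np.argmax(gaps))

# Cut halfway inside that gap.
threshold = (heights[i] + heights[i + 1]) / 2

# Flat cluster labels obtained by cutting the tree at the threshold.
labels = fcluster(Z, t=threshold, criterion="distance")
n_clusters = len(np.unique(labels))

print(f"cut at {threshold:.2f} gives {n_clusters} clusters")
```

To reproduce a manual choice instead, the threshold can be set by hand (for example `t=60` with `criterion="distance"`), or a fixed number of clusters can be requested with `criterion="maxclust"`.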

Post-clustering tSNE and PCA

Post-clustering tSNE and PCA Interpretations

- The PCA plot shows distinct clusters, indicating that the hierarchical clustering algorithm has effectively separated customers into groups with similar characteristics.
- The features captured by the principal components are relevant for differentiating customer segments.
- The density of points within each cluster provides insight into the homogeneity of that cluster.

- The t-SNE plot also shows distinct clusters, confirming the effectiveness of the hierarchical clustering.
- t-SNE captures non-linear relationships between data points. The clusters in the t-SNE plot exhibit complex, non-linear structures that are not apparent in the PCA plot.
- The separation of clusters in both plots suggests that the hierarchical clustering algorithm has successfully identified meaningful groups of customers based on their shared characteristics.
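
A post-clustering view of this kind is usually produced by fitting the clustering first and then using the labels to color the 2-D projection. The following is a minimal sketch under stated assumptions: synthetic blob data from `make_blobs`, Ward linkage, and six clusters as read off the dendrogram. The variable names are illustrative and not taken from this notebook.

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.preprocessing import StandardScaler
from sklearn.cluster import AgglomerativeClustering
from sklearn.decomposition import PCA

# Hypothetical data standing in for the three customer features.
X, _ = make_blobs(n_samples=200, centers=6, n_features=3, random_state=0)
X_scaled = StandardScaler().fit_transform(X)

# Agglomerative clustering with six clusters (Ward linkage is an assumption).
agg = AgglomerativeClustering(n_clusters=6, linkage="ward")
labels = agg.fit_predict(X_scaled)

# 2-D projection; the labels can then be used as the color of each point.
X_pca = PCA(n_components=2).fit_transform(X_scaled)

# Number of customers assigned to each cluster.
counts = np.bincount(labels)
print(counts)
```

Passing `labels` as the color argument of a scatter plot of `X_pca` (or of a t-SNE embedding computed the same way) gives the colored view described in the bullets above.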

Radar Chart

Interpretation

The radar chart visualizes the average values of the three features (Annual Income, Age, and Spending Score) for each of the six clusters identified by hierarchical clustering.
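
The values behind such a radar chart are per-cluster feature means, typically rescaled to a common range so that the axes are comparable. The sketch below uses a tiny made-up dataset with two clusters to keep the arithmetic easy to follow. The min-max scaling and all names are assumptions, not the notebook's actual code.

```python
import numpy as np

features = ["Age", "Annual Income", "Spending Score"]

# Hypothetical toy data: four customers, two clusters.
X = np.array([
    [20.0, 30.0, 80.0],
    [24.0, 34.0, 90.0],
    [50.0, 90.0, 20.0],
    [54.0, 94.0, 10.0],
])
labels = np.array([0, 0, 1, 1])

# Mean of every feature within each cluster (one row per cluster).
cluster_ids = np.unique(labels)
means = np.vstack([X[labels == k].mean(axis=0) for k in cluster_ids])

# Min-max scale each feature to [0, 1] so all radar axes share one range.
lo, hi = X.min(axis=0), X.max(axis=0)
scaled = (means - lo) / (hi - lo)

# One angle per feature; repeat the first point to close each polygon.
angles = np.linspace(0, 2 * np.pi, len(features), endpoint=False)
closed_angles = np.append(angles, angles[0])
closed_values = np.hstack([scaled, scaled[:, :1]])

print(means)
```

Each row of `closed_values`, plotted against `closed_angles` on a polar axis, traces the outline of one cluster.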

Comparison of K-Means with Hierarchical Clustering

The K-Means cluster centers radar chart shows the average values of the three features (Annual Income, Age, and Spending Score) for each of the six clusters identified through K-Means clustering. The chart reveals significant overlap between the clusters, indicating that they are not well separated and may not represent distinct customer segments.
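
One possible way to set up this comparison is sketched below. It fits K-Means with six clusters, converts the cluster centers back to the original feature units (the values a centers radar chart would display), and measures how closely the K-Means labels agree with the agglomerative ones using the adjusted Rand index. The synthetic data, the Ward linkage, and the use of the adjusted Rand index are all assumptions made for illustration. The notebook itself compares the two methods through their radar charts.

```python
from sklearn.datasets import make_blobs
from sklearn.preprocessing import StandardScaler
from sklearn.cluster import KMeans, AgglomerativeClustering
from sklearn.metrics import adjusted_rand_score

# Hypothetical data standing in for the three customer features.
X, _ = make_blobs(n_samples=200, centers=6, n_features=3, random_state=0)
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)

# K-Means with six clusters, matching the number used for the hierarchy.
km = KMeans(n_clusters=6, n_init=10, random_state=0).fit(X_scaled)

# Centers in original feature units: the values a centers radar chart shows.
centers_original = scaler.inverse_transform(km.cluster_centers_)

# Agglomerative labels on the same data (Ward linkage is an assumption).
agg_labels = AgglomerativeClustering(n_clusters=6, linkage="ward").fit_predict(X_scaled)

# Agreement between the two partitions; 1.0 means identical groupings.
ari = adjusted_rand_score(km.labels_, agg_labels)

print(centers_original.round(2))
print(f"adjusted Rand index: {ari:.3f}")
```

A low adjusted Rand index would indicate that the two methods partition the customers differently, which is one way to quantify the kind of disagreement the radar charts show visually.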